Assessing the calibration of subdistribution hazard models in discrete time
Authors
Abstract
The generalization performance of a risk prediction model can be evaluated by its calibration, which measures the agreement between predicted and observed outcomes on external validation data. Here, we propose methods for assessing the calibration of discrete time-to-event models in the presence of competing risks. Specifically, we consider the class of discrete subdistribution hazard models, which directly relate the cumulative incidence function of one event of interest to a set of covariates. We apply the proposed methods to a prediction model for the development of nosocomial pneumonia. Simulation studies show that the methods are strong tools for calibration assessment even in scenarios with a high censoring rate and/or a large number of discrete time points.

Over the past decade, risk prediction models have become an indispensable tool for decision making in applied research. Popular examples include diagnosis and prognosis in the health sciences, where risk prediction is used to support screening and therapy decisions (Moons et al., 2012b; Liu et al., 2014; Steyerberg, 2019), and ecological research, where prediction models are an established means to quantify and forecast the impact of technology (Gibbs, 2011). A key aspect of risk prediction modelling is the evaluation of a model's generalization performance. This task, usually performed by applying one or more previously derived candidate models to one or more sets of independent validation data, has been the subject of extensive methodological research (Moons et al., 2012a; Steyerberg & Vergouwe, 2014; Harrell, 2015; Alba et al., 2017).
As a result, strategies for investigating a model's discriminatory power (measuring how well the model separates cases from controls), its calibration (measuring the agreement between predicted and observed outcomes) and its prediction error (quantifying both discrimination and calibration aspects) have been developed (Steyerberg et al., 2010). Alternative techniques that additionally involve decision-analytic aspects include, among others, net benefit analysis (Vickers et al., 2016), decision curve analysis (Vickers & Elkin, 2006) and relative utility curves (Baker et al., 2009; Kerr & Janes).

The aim of this article is to develop methods for assessing the calibration of prediction models with a time-to-event outcome. Calibration assessment for time-to-event models has been dealt with extensively during the past years; see, for example, Henderson & Keiding (2005), Witten & Tibshirani (2010), Soave & Strug (2018) and Braun et al. (2018). Here we explicitly assume that the event times are measured on a discrete scale t = 1, 2, … (Ding et al., 2012; Tutz & Schmid, 2016; Berger & Schmid, 2018) and that the event of interest may occur along with one or more "competing" events (Fahrmeir & Wagenpfeil, 1996; Fine & Gray, 1999; Lau et al., 2009; Beyersmann et al., 2011; Austin & Lee, 2018; Schmid & Berger, 2020). Scenarios of this type are frequently encountered in observational studies with a limited number of fixed follow-up measurements, for instance, in epidemiology (Andersen et al., 2012). Such study designs do not allow for recording the exact (continuous) event times, so it is only known whether an event (or which type of event) occurred between two consecutive measurements at a_{t−1} and a_t, implying that discrete time-to-event data refer to a special case of interval-censored data with fixed intervals. An important example, which will be considered in this article, is the duration until the development of nosocomial pneumonia (NP) in intensive care patients, measured on a daily basis (Wolkewitz et al., 2008). As NP infections are associated with an increased length of hospital stay and with considerable morbidity and mortality, it is highly relevant to build statistical models that give valid predictions for future patients. Interestingly, early discrete-time competing risks data can already be found in the literature of the 1860s (Nightingale, 1863, Chapter IX). In Section 7 we will use these data to validate the prediction model of Berger et al. (2020).

In recent years, several authors have proposed estimators for analyzing the calibration of time-to-event models. For continuous-time models, calibration measures were proposed in 2016. Graphical methods (not accounting for the occurrence of competing events) have also been explored. Methods for cause-specific hazard models (a common approach to competing risks analysis) were recently proposed by Heyard et al. (2020). Here we base our calibration assessments on the cumulative incidence function F_j(t | x) := P(T ≤ t, ε = j | x), with T denoting the time to the first event, x a vector of covariates, and ε ∈ {1, …, J} a random variable that indicates which out of J possible events occurred first (Fine & Gray, 1999; Klein & Andersen, 2005). In the following, without loss of generality, the event of interest and the competing events are defined by ε = 1 and ε ∈ {2, …, J}, respectively. A popular method to derive a prediction model for F_1 from a set of training data is to fit a proportional subdistribution hazard model (Fine & Gray, 1999).
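The proportional subdistribution hazard model mentioned at the end of the paragraph has, in its original continuous-time form (Fine & Gray, 1999), the familiar proportional-hazards structure; this rendering is a reconstruction from the standard formulation, not a quote from the garbled text:

```latex
\lambda(t \mid \boldsymbol{x}) \;=\; \lambda_0(t)\,\exp\!\bigl(\boldsymbol{x}^\top \boldsymbol{\gamma}\bigr),
\qquad
\lambda(t \mid \boldsymbol{x}) \;=\; -\,\frac{\mathrm{d}}{\mathrm{d}t}\,
\log\!\bigl(1 - F_1(t \mid \boldsymbol{x})\bigr),
```

where λ₀ is an unspecified baseline subdistribution hazard and γ the vector of covariate effects, so that covariate effects act directly on the cumulative incidence function F₁.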
The latter model, which was designed for right-censored continuous-time data, is the one considered here. It has been recommended to analysts "whenever the focus is on estimating and predicting absolute risks" (Austin et al., 2016). While the original model by Fine & Gray (1999) assumed the event times to be measured on a continuous scale, our proposed methods are based on an extension of subdistribution hazard modelling to the discrete time scale (Berger et al., 2020). This extension defines a one-to-one relationship between the subdistribution hazard and the cumulative incidence function (see Sections 2 and 3 for details). The calibration of a model for F_1 is thus characterized by its subdistribution hazards, which in a validation sample can be approximated by their respective empirical counterparts.

The proposed methodology comprises two parts, both obtained from weighted binary regression: The first part (presented in Section 4) is concerned with the derivation of an appropriate calibration plot that visualizes the agreement between predicted and observed subdistribution hazards. The second part (Section 5) uses the concept of recalibration to analyze the calibration-in-the-large and the refinement of a prediction model (i.e., the bias and the variation, respectively, of the predicted hazards), along the lines of Cox (1958) and Miller et al. (1993). As shown in Sections 4 and 5, the weights defining the empirical versions of the hazards (to be depicted in the calibration plot) can also be used for fitting a weighted logistic regression model (giving rise to point estimates and hypothesis tests for calibration-in-the-large and refinement). The proposed methods are illustrated in a simulation study (Section 6) and by an application to the aforementioned NP data (Section 7). Section 8 summarizes the main findings of the article.

Let Ti and Ci denote the i.i.d. event and censoring times of n individuals i = 1, …, n. Both are discrete random variables (random censoring) taking values in {1, …, k}, with k a natural number. It is further assumed that the censoring mechanism is non-informative for Ti, in the sense that it does not depend on any parameters of the event time distribution (Kalbfleisch & Prentice, 2002). In studies with longitudinal visits, the discrete values refer to the time intervals [0, a1), [a1, a2), …, [a_{k−1}, ∞), which means that T = t if the event occurred in [a_{t−1}, a_t), with a_k = ∞. The period during which an individual is under observation is denoted by T̃i = min(Ti, Ci); that is, T̃i corresponds to the true event time if Ti ≤ Ci and to the censoring time otherwise. The status indicator is defined by Δi = I(Ti ≤ Ci) ∈ {0, 1}, and εi indicates the type of event experienced by the ith individual. In accordance with Fine & Gray (1999), our definition of the conditional subdistribution hazard takes into account that individuals who have experienced a competing event remain part of the risk set. It should be noted that the inclusion of time-varying covariates in Model (6) causes problems. This is because the log-likelihood in (7) requires time-dependent covariate values up to, and possibly beyond, the observed event times. When covariates are external, this is often unrealistic or even impossible. In particular, the weights in (4) cannot be written in closed form when the model includes internal time-varying covariates (Cortese & Andersen). Finding an adequate strategy for time-varying covariates in the subdistribution hazard framework remains challenging (Poguntke et al.).

In the following, the aim is to predict individual-specific hazards in some population. The fitted model is assessed on a validation sample of N individuals m = 1, …, N. The starting point of our considerations is an approach from 2018 that applies to the single-event case (J = 1). Note that the specification of the model and the definition of the log-likelihood remain valid in this case, corresponding to J = 1 in Equations (5) and (7). The idea underlying the calibration plot is to split the test data into G subsets D1, …, DG defined by the percentiles of the predicted values ŷ_m, m = 1, …, N, in (5).
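The one-to-one relationship between the discrete subdistribution hazard and the cumulative incidence function referred to above can be sketched as follows; this is a reconstruction based on the discrete subdistribution hazard of Berger et al. (2020):

```latex
\lambda(t \mid \boldsymbol{x})
\;=\;
P\bigl(T = t,\ \varepsilon = 1 \,\big|\,
\{T \ge t\} \cup \{T \le t-1,\ \varepsilon \ne 1\},\ \boldsymbol{x}\bigr),
\qquad t = 1, \ldots, k,
```

```latex
F_1(t \mid \boldsymbol{x})
\;=\;
1 - \prod_{s=1}^{t} \bigl(1 - \lambda(s \mid \boldsymbol{x})\bigr).
```

Hence a set of predicted subdistribution hazards fully determines the predicted cumulative incidence of the event of interest, which is why calibration can be assessed on the hazard scale.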
Following Hosmer & Lemeshow (2013), the average predicted hazards of the groups are subsequently plotted against the respective empirical hazards, given by the group-wise relative frequencies of the outcome y_mt = 1. A well-calibrated model is indicated by points close to the 45-degree line. Now consider the case where, in addition to the event of interest, competing events are observed. The outcome definition then becomes the one in (2). To obtain the calibration plot for this model, we define quantities λ̄ and group-wise averages analogous to the single-event case in (11). Unlike the single-event case, however, the definition of the weights w_mt is less straightforward: the problem is that individuals experiencing a competing event continue to be part of the risk set until their censoring time. Hence Cm is unobserved whenever Cm > T̃m, yet the weights would still have to be determined for these individuals. Following Berger et al. (2020), we therefore replace the probability of being uncensored by its estimated counterpart.

The construction of the calibration plot can be summarized as follows: (i) Sort the predicted values and form the groups D1, …, DG. (ii) Compute the group-wise averages using Formulas (9) and (10), with V(·) estimated as in (11) in step (ii). (iii) By definition, each group contains approximately |Dg| ≈ N/G observations. (iv) Plot the empirical against the predicted hazards (using identical scales on both axes). (v) Assess calibration by inspecting the plot generated in (iv). It should be emphasized that this constitutes an exploratory approach, as the inspection in (v) generally involves a subjective impression. Formal tests are presented in the next section.

In addition to the graphical checks presented in Section 4, we consider a recalibration framework originating from Cox (1958). This method was originally proposed in order to investigate the calibration of predicted probabilities of a binary outcome variable. In Model (13), a simple linear predictor is placed on the logits of the predicted hazards. Alternatively, one could use other link functions, like the probit or the complementary log–log link. The intercept a measures the "calibration-in-the-large," i.e., whether the predictions are systematically too low (a > 0) or too high (a < 0). Analogously, the slope b measures the "refinement," i.e., whether the predictions do not show enough variation (b > 1), show too much variation (0 < b < 1), or point in the wrong general direction (b < 0; Miller et al., 1993). To assess calibration, we follow the suggestions of Miller et al. (1993) and conduct tests of the following null hypotheses: (i) H0: a = 0, b = 1 (overall calibration); (ii) H0: a = 0 | b = 1 (calibration-in-the-large); (iii) H0: b = 1 | a = 0 (refinement, corrected for calibration-in-the-large).

In this section we present the results of numerical experiments that were conducted to evaluate the proposed methods under a variety of conditions, measuring different rates of the events of interest and the competing events, levels of censoring, varying numbers of time points, and various forms of model misspecification. The discrete event times Tdisc,i were generated according to the indirect scheme described above and grouped into k categories, k ∈ {5, 10, 15}. The category boundaries were defined by the quantiles of a pre-estimated sample of 1,000,000 observations. As a consequence, the same boundaries were used in each simulation run. Censoring times were drawn from a distribution with density proportional to u + s/k, where the percentage of censored observations was controlled by the parameter u. The covariates comprised two standard normally distributed variables, xi1, xi2 ∼ N(0, 1), and two binary variables, xi3, xi4 ∼ Bin(1, 0.5), all independent, with coefficients chosen as in Fine & Gray (1999). We specified three rates of the event of interest, labelled weak, medium and strong, by varying the degree of the subdistribution.
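The two-part procedure described above — grouping the predicted hazards into G percentile groups for the calibration plot, and fitting the weighted logistic recalibration model logit(λ) = a + b·logit(λ̂) — can be sketched in Python. The paper's own implementation is in R (via discSurv and glm()); the function names below are illustrative, and the sketch assumes the person-period outcomes, predictions and weights have already been assembled:

```python
import numpy as np

def recalibrate(logit_pred, y, w, n_iter=25):
    """Fit the recalibration model y ~ a + b * logit_pred by
    Newton-Raphson for weighted logistic regression.  Good calibration
    corresponds to intercept a near 0 (calibration-in-the-large) and
    slope b near 1 (refinement)."""
    X = np.column_stack([np.ones_like(logit_pred), logit_pred])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))                  # weighted score
        H = X.T @ (X * (w * p * (1 - p))[:, None])  # weighted information
        beta += np.linalg.solve(H, grad)
    return beta  # array([a, b])

def calibration_groups(pred_hazard, y, w, G=20):
    """Split observations into G groups by the percentiles of the
    predicted hazards and return, per group, the weighted mean predicted
    hazard and the weighted empirical hazard (the calibration plot points)."""
    order = np.argsort(pred_hazard)
    pts = []
    for g in np.array_split(order, G):
        pts.append((np.average(pred_hazard[g], weights=w[g]),
                    np.average(y[g], weights=w[g])))
    return np.array(pts)
```

For a well-calibrated model the points returned by `calibration_groups` lie near the diagonal and `recalibrate` returns estimates close to (0, 1), which is what the hypothesis tests (i)–(iii) formalize.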
More specifically, the corresponding parameter was set to 0.85 (weak) and 1.25 (strong), with an intermediate value for the medium rate; see Figure S1 of the Supplementary Material. Censoring rates were specified via q ∈ {0.2, 0.4, 0.8}. In total, this resulted in 27 scenarios. Each scenario was analyzed using 100 replications of 5000 independently drawn observations each, split equally into training and validation data (n = 2500 and N = 2500). The Supplementary Material illustrates the nine scenarios with k = 5. It is seen that the censoring rate increases with u; for constant u, the ratio of the event rates remained approximately the same, and censoring was less frequent for q = 0.2 than for q = 0.8. Results for k = 10 and k = 15 were almost identical and are therefore not shown.

In scenario (a) the validated model coincided with the data-generating model, whereas in scenarios (b)–(e) it was misspecified. The plots refer to a randomly chosen replication with k = 5. Using Equation (12), the weak scenario, for example, yielded a value of approximately 20.46; thus, G = 20 groups were used throughout the study. In scenario (a) the predicted and empirical hazards coincided strongly, regardless of the rate of the events of interest. This result confirms that the calibration plot correctly identifies a well-specified model. The plot also works for relatively small numbers of events (for the weak rate, only about 10% of the events of interest were observed). For the dataset in the upper right panel, for instance, the recalibration estimates were â ≈ 0.084 and b̂ ≈ 0.957, indicating nearly perfect calibration. Exemplary plots for k = 10 and k = 15 are shown in Figures S2 and S3. Again, the results suggest nearly perfect calibration, with only a minor deviation apparent. Boxplots of the estimates show that, on average, the estimates were very close to the true values (lower panel), with variance decreasing in the number of events; in contrast, the weak scenario had the most variable estimates. The corresponding P-values from conducting tests (i)–(iii) show that, throughout the scenarios, the null hypotheses were kept (at the 5% level). Hypothesis (ii) yielded the largest P-values (corresponding to small negative log10-transformed values) and was never rejected at the 5% level. Overall, the results illustrate that the proposed methods work properly when the model is correctly specified. Figures S4–S7 largely confirmed these findings. Although the estimates deviated more strongly in Figure S7 of the Supplementary Material, less than half of the hypotheses were rejected. These results are clearly related to the lengths of the time intervals and are explained by the fact that few events were observed at later time points as censoring increased. For q = 0.2, fewer than four events were observed in most samples.

Regarding the investigation of misspecification (b), no noticeable difference between the Gompertz-based scenarios and those of Section 6.2 was found (results not shown). In scenario (c), calibration deteriorated in a substantial proportion of the samples, as reflected by all measures: the empirical hazards mostly lay below the predicted ones, and the predictions were therefore systematically too high. The estimated intercepts (Figure S8 of the Supplementary Material) were clearly below zero; for low censoring (q = 0.2), the mean intercept was even smaller than −1. Accordingly, hypothesis (ii) was consistently rejected (Figure S11 of the Supplementary Material). Rejection rates again reflected the systematic shift towards higher predicted values. Only in the scenarios with good calibration did the censoring rate not substantially affect the results.

Exemplary fits for the third source of misspecification (d), in which a predictor was falsely assumed to be linear, showed estimates that spread considerably wider around the true values than in the correctly specified scenarios; the slopes (Figure S9) were distinctly less than 1 (in particular for q = 0.2). Remarkably, the intercepts were hardly affected throughout. Figure S12 demonstrates that the tests had sufficient power to detect this misspecification. On the other hand, the method was prone to indicating poor calibration particularly for q = 0.2. In the last setting (e), a violation of the proportionality assumption (calibration plots in Figure 6), a line of points deviating from the diagonal was evident (most clearly for q = 0.8); the corresponding estimates and rejection rates (Figures S10 and S13) were affected to a lesser extent.
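The fixed-boundary discretization described for the simulations — category cut points pre-estimated once from 1,000,000 reference observations and then reused in every run — can be sketched as follows. The exponential reference distribution is an assumption of this sketch, standing in for the paper's actual generating model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Pre-estimate interval boundaries once from a large reference sample,
# so that every simulation run uses identical cut points.
reference = rng.exponential(1.0, size=1_000_000)
k = 10                                               # number of discrete time points
cuts = np.quantile(reference, np.linspace(0, 1, k + 1)[1:-1])

def discretize(times, cuts):
    """Map continuous event times to discrete categories 1..k using
    the fixed, pre-estimated interval boundaries."""
    return np.searchsorted(cuts, times) + 1
```

Applying `discretize` to a fresh sample from the same distribution yields roughly equal category frequencies, so all simulation runs share the same discrete time grid.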
In contrast to scenario (c), censoring appeared less problematic here; it likely mainly affects the estimation of the weights, rendering them inaccurate. To sum up, scenarios (c)–(e) demonstrate that the proposed measures are sensitive to the severity of the calibration issues: deviations were most pronounced for an incorrectly specified model (and high censoring), and least pronounced when the proportionality assumption was only mildly violated (or the number of events was small).

To gain further insight, we validated a reduced prediction model for nosocomial pneumonia. Pneumonia is a common hospital-acquired infection in intensive care units (ICUs). Building on earlier analyses from 2006, Wolkewitz et al. (2008) and other authors aimed to model mortality in ICUs and to determine risk factors for the disease. The data were collected in a prospective cohort study in five ICUs of a university hospital, lasting 18 months (from February 2000 to July 2001) and comprising 1876 ICU patients. The event of interest was the onset of an NP infection; the other possible events (competing with the event of interest) were death and discharge alive. Owing to the study design, the event times are discrete, measured on a daily basis. Stays of over 60 days were collapsed at day 61 into a final category referred to as ≥61. At the end of follow-up, 158 patients had acquired pneumonia, 1695 had died or been released alive, and 23 were administratively censored. Descriptive summary statistics of the baseline covariates are given in the table of Wolkewitz et al. (2008). The covariates comprised age (centred, in years), the gender of the patients, the simplified acute physiology score (SAPS II) and 11 further variables characterizing the patients and their stay. The SAPS II measures the disease severity of patients admitted to the ICU. It is calculated from 12 routine physiological measurements during the first 24 h, with values in the range [0, 163] (Le Gall et al.). Variables were recorded either on admission or prior to admission (before admission). The model described above incorporates a smooth baseline, represented by cubic P-splines with a second-order penalty (fitted with the R package mgcv). Model validation was conducted as a benchmark experiment based on random partitions of the data. Each partition consisted of a training sample of size 1500 (80%) and a validation sample of size 376 (20%). The results can be summarized as follows: NP acquisition was significantly associated with, among other factors, male gender, intubation on admission, elective or emergency surgery before admission, and cardio/pulmonary or neurological disease. Apart from the largest percentiles, the predicted hazards were small (<0.005). Furthermore, the calibration plots showed points close to the diagonal, indicating satisfactory calibration (Figure S14). Except for a few extreme values, the plots reveal no severe deviations. However, the bundle of lines makes a visual evaluation rather difficult. Boxplots of the estimates obtained from the benchmark experiment are presented in Figure 8; the median estimates were â ≈ 0.039 and b̂ ≈ 0.984, with P-values of 0.809 for test (i), 0.518 for test (ii) and 0.935 for test (iii). The boxplots (left panel of Figure 8) indicate that the intercept estimates tended to be close to zero and varied little. Importantly, this trend matches the simulations in Figures S4 and S5, which had comparable data characteristics. Also note that G = 20 groups were used. According to the P-values (right panel of Figure 8), no substantial miscalibration was detected in the majority of partitions.

Discrete time-to-event models have gained widespread popularity (Tutz & Schmid, 2016; Berger & Schmid, 2018), and methods for properly assessing their calibration are increasingly necessary. In this regard, the methods proposed here constitute a new approach to competing risks analysis.
They consist of a calibration plot and recalibration-based tests of calibration-in-the-large and refinement, and are closely connected to established approaches for binary outcomes (Miller et al., 1993; Hosmer & Lemeshow, 2013). In the single-event scenario, the proposed methodology naturally reduces to established methods, as the risk set then contains only individuals still under observation at the respective time. An advantage of subdistribution hazard modelling is that it does not need a specific model for the competing events. Subdistribution hazard models are also of practical importance because they allow the direct interpretation of increasing or decreasing covariate effects on the target cumulative incidence function (Fine & Gray, 1999). Young et al. suggested differences of cumulative incidence functions as estimands in causal modelling; an alternative of this kind was proposed in 2017 and compared with a nonparametric approach. Our work assumes correct weight specifications; "unfavourable" settings require careful consideration, although such situations are rare. The proposed evaluations are available as an add-on to the R package discSurv (Welchowski & Schmid, 2019). The package contains the function dataLongSubDist() to generate the outcome and weight vectors in (8) and (10); passing these to glm() with family = binomial() fits the weighted logistic regression.

We thank Jan for fruitful discussions and helpful comments that improved the manuscript, and the SIR-3 study investigators for providing us with the data. Support by the German Research Foundation (DFG), grant SCHM 2966/2-1, is gratefully acknowledged.

Supporting Information. Please note: the publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author.
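The data augmentation performed by dataLongSubDist() can be illustrated conceptually: each subject is expanded into one person-period row per time point at which it belongs to the subdistribution risk set, with a binary outcome marking the event of interest. The following Python sketch is illustrative only; the actual R function's output format and weighting differ, and the helper name is hypothetical:

```python
import numpy as np

def long_format(t_obs, eps, k):
    """Expand (observed time, event type) pairs into person-period rows
    (subject index, time point, binary outcome).
    eps: 0 = censored, 1 = event of interest, 2+ = competing event.
    Subjects with a competing event remain in the subdistribution risk
    set until the final time point k; the outcome y is 1 only at the
    time of the event of interest."""
    rows = []
    for i, (t, e) in enumerate(zip(t_obs, eps)):
        horizon = k if e >= 2 else t       # competing events stay at risk
        for s in range(1, horizon + 1):
            y = 1 if (e == 1 and s == t) else 0
            rows.append((i, s, y))
    return np.array(rows)
```

The resulting long-format outcome vector, together with inverse-probability-of-censoring weights, is exactly what a weighted binomial glm() consumes to estimate the discrete subdistribution hazards.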
Similar resources
Time-aggregation effects on the baseline of continuous-time and discrete-time hazard models
In this study we reinvestigate the effect of time aggregation for discrete- and continuous-time hazard models. We reanalyze the results of a previous Monte Carlo study by ter Hofstede and Wedel (1998), in which the effects of time aggregation on the parameter estimates of hazard models were investigated. Whereas the previous study shows that the estimates of the baseline parameters of continuous...
Assessing the moral hazard impact of mango farmers in Chabahar
ABSTRACT: The agricultural sector encompasses activities that are exposed to diverse risks. Risks in the agricultural sector are unavoidable but manageable. Crop insurance is a management tool in the agricultural sector: a strategy to cope with the production risks of the sector and to secure farmers' income in the future. Mango is a major horticul...
The study of bright and surface discrete cavity soliton dynamics in saturable nonlinear media
Today, solitons, as localized waves that propagate without change of shape under particular conditions, are the subject of extensive studies in nonlinear optics. In this context, attention to the phenomenon of discrete diffraction, which appears as the cause of broadening of an optical beam in an array of coupled waveguides, is essential, since discrete solitons arise from the cancellation of discrete diffraction in these systems by nonlinear effects. The discreteness of the system...
Discrete-time Multilevel Hazard Analysis
Combining innovations in hazard modeling with those in multilevel modeling, we develop a method to estimate discrete-time multilevel hazard models. We derive the likelihood of and formulate assumptions for a discrete-time multilevel hazard model with time-varying covariates at two levels. We pay special attention to assumptions justifying the estimation method. Next, we demonstrate file constru...
The importance of censoring in competing risks analysis of the subdistribution hazard
BACKGROUND The analysis of time-to-event data can be complicated by competing risks, which are events that alter the probability of, or completely preclude the occurrence of an event of interest. This is distinct from censoring, which merely prevents us from observing the time at which the event of interest occurs. However, the censoring distribution plays a vital role in the proportional subdi...
Journal
Journal title: Canadian Journal of Statistics
Year: 2021
ISSN: 0319-5724, 1708-945X
DOI: https://doi.org/10.1002/cjs.11633